Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach
A major challenge in reinforcement learning is determining which state-action pairs are responsible for delayed future rewards. Reward redistribution addresses this by re-assigning credit to each time step of an observed sequence. While the majority of current approaches construct the reward redistribution in an uninterpretable manner, we propose to explicitly model the contributions of state and action from a causal perspective, resulting in an interpretable reward redistribution while preserving policy invariance. In this paper, we first study the role of causal generative models in reward redistribution by characterizing the generation of Markovian rewards and the trajectory-wise long-term return, and then propose a framework, called Generative Return Decomposition (GRD), for policy optimization in delayed-reward scenarios. Specifically, GRD first identifies the unobservable Markovian rewards and causal relations in the generative process.
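As a hedged sketch of the return-decomposition idea behind this line of work (not the GRD implementation itself), the PyTorch snippet below trains a proxy-reward network whose per-step predictions are constrained to sum to the observed delayed return; the names `ProxyReward` and `decomposition_loss` are illustrative.

```python
import torch
import torch.nn as nn

# Minimal sketch of trajectory-wise return decomposition (not the GRD
# implementation): a proxy-reward network predicts a Markovian reward
# r_hat(s_t, a_t) for every step and is trained so that the per-step
# predictions sum to the observed episodic return.
class ProxyReward(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, states, actions):
        # states: (T, state_dim), actions: (T, action_dim)
        return self.net(torch.cat([states, actions], dim=-1)).squeeze(-1)

def decomposition_loss(model, states, actions, episode_return):
    """Squared error between the sum of predicted step rewards and the
    observed delayed return; minimizing it redistributes credit over time."""
    r_hat = model(states, actions)  # (T,)
    return (r_hat.sum() - episode_return) ** 2
```

Because the redistributed rewards sum to the original return, they can be fed to a standard RL algorithm as a dense training signal without changing which policies are optimal, which is the policy-invariance property the abstract refers to.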
Reviews: Equality of Opportunity in Classification: A Causal Approach
I acknowledge having read the rebuttal. Fairness is a complicated and important matter. Due to the nature of the problem, there might not be a universal characterization of it, but any proposed criterion should be accompanied by a compelling story and a reasonable explanation of why we should consider it. This paper provides a new (causal) interpretation of equalized odds (EO), an associative measure that has been used as a framework to talk about discrimination in classification problems. The central point of the paper is to learn a fair classifier by constraining each of the three (causal) components of EO (i.e., the counterfactual direct, indirect, and spurious components).
Council Post: Three Ways A Causal Approach Can Improve Trust In AI
Bernd Greifeneder is the CTO and Founder of Dynatrace, a software intelligence company that helps simplify enterprise cloud complexity. IT, development, and business departments are under more pressure than ever to innovate. However, this has led to applications becoming increasingly complex as organizations move to more dynamic, multicloud environments for greater agility. DevOps and SRE teams need to make sense of this complexity and optimize their services, but doing so drains the time they can devote to innovation. The move to cloud-native architectures is also making it harder for these teams to quickly identify vulnerabilities.
Causal Markov Decision Processes: Learning Good Interventions Efficiently
Lu, Yangyi, Meisami, Amirhossein, Tewari, Ambuj
We introduce Causal Markov Decision Processes (C-MDPs), a new formalism for sequential decision making which combines the standard MDP formulation with causal structures over state transition and reward functions. Many contemporary and emerging application areas such as digital healthcare and digital marketing can benefit from modeling with C-MDPs due to the causal mechanisms underlying the relationship between interventions and states/rewards. We propose the causal upper confidence bound value iteration (C-UCBVI) algorithm that exploits the causal structure in C-MDPs and improves the performance of standard reinforcement learning algorithms that do not take causal knowledge into account. We prove that C-UCBVI satisfies an $\tilde{O}(HS\sqrt{ZT})$ regret bound, where $T$ is the total number of time steps, $H$ is the episodic horizon, and $S$ is the cardinality of the state space. Notably, our regret bound does not scale with the size of the action/intervention space ($A$), but only with a causal graph dependent quantity $Z$, which can be exponentially smaller than $A$. By extending C-UCBVI to the factored MDP setting, we propose the causal factored UCBVI (CF-UCBVI) algorithm, which further reduces the regret exponentially in terms of $S$. Furthermore, we show that RL algorithms for linear MDP problems can also be incorporated into C-MDPs. We empirically show the benefit of our causal approaches in various settings, validating our algorithms and theoretical results.
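To make the flavor of the result concrete, here is a minimal tabular sketch of optimistic value iteration in the UCBVI style. It is an illustrative assumption rather than the paper's C-UCBVI pseudocode: the parameter `Z` stands in for the causal-graph-dependent quantity that replaces the action-set size $A$ in the confidence width, and all function and argument names are hypothetical.

```python
import numpy as np

# Illustrative UCB-style value iteration (a generic sketch, not C-UCBVI):
# optimistic backward planning with a per-(s, a) exploration bonus whose
# width scales with a causal quantity Z instead of the number of actions.
def ucb_value_iteration(P_hat, R_hat, N, H, Z, delta=0.05):
    """P_hat: (S, A, S) empirical transitions, R_hat: (S, A) empirical
    rewards, N: (S, A) visit counts. Returns optimistic Q of shape (H, S, A)."""
    S, A, _ = P_hat.shape
    Q = np.zeros((H + 1, S, A))
    V = np.zeros((H + 1, S))  # V[H] = 0 at the terminal step
    log_term = np.log(S * A * H / delta)
    for h in range(H - 1, -1, -1):
        # Confidence width ~ sqrt(Z), not sqrt(A): the causal structure
        # shrinks the effective number of interventions to explore.
        bonus = H * np.sqrt(Z * log_term / np.maximum(N, 1))
        Q[h] = np.minimum(R_hat + P_hat @ V[h + 1] + bonus, H)  # optimism, clipped at H
        V[h] = Q[h].max(axis=1)
    return Q[:H]
```

Acting greedily with respect to this optimistic `Q` is what yields regret that depends on the confidence-width quantity (here `Z`) rather than on the raw size of the intervention set.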
Equality of Opportunity in Classification: A Causal Approach
Zhang, Junzhe, Bareinboim, Elias
The Equalized Odds (for short, EO) is one of the most popular measures of discrimination used in the supervised learning setting. It ascertains fairness through the balance of the misclassification rates (false positive and false negative) across the protected groups -- e.g., in the context of law enforcement, an African-American defendant who would not commit a future crime will have an equal opportunity of being released, compared to a non-recidivating Caucasian defendant. Despite this noble goal, it has been acknowledged in the literature that statistical tests based on the EO are oblivious to the underlying causal mechanisms that generated the disparity in the first place (Hardt et al. 2016). This leads to a critical disconnect between statistical measures readable from the data and the meaning of discrimination in the legal system, where characterizing discrimination requires compelling evidence that the observed disparity is tied to a specific causal process deemed unfair by society. The goal of this paper is to develop a principled approach to connecting the statistical disparities characterized by the EO with the underlying, elusive, and frequently unobserved causal mechanisms that generated such inequality.
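For reference, the associative EO test that the abstract starts from can be computed directly from model outputs. The sketch below (with illustrative names `y_true`, `y_pred`, `group`, assuming binary 0/1 labels and predictions) measures the false-positive and false-negative rate gaps across groups; as the paper argues, this statistical balance by itself says nothing about the causal mechanism behind any observed gap.

```python
import numpy as np

def error_rates(y_true, y_pred, group, g):
    """FPR and FNR for protected group g (binary 0/1 labels and predictions)."""
    mask = group == g
    fpr = np.mean(y_pred[mask][y_true[mask] == 0])      # predicted 1 among true 0
    fnr = np.mean(1 - y_pred[mask][y_true[mask] == 1])  # predicted 0 among true 1
    return fpr, fnr

def eo_gaps(y_true, y_pred, group):
    """Max absolute FPR/FNR differences across groups; EO holds when both are ~0."""
    rates = [error_rates(y_true, y_pred, group, g) for g in np.unique(group)]
    fprs, fnrs = zip(*rates)
    return max(fprs) - min(fprs), max(fnrs) - min(fnrs)
```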